Create decoder for HTML entities #2563
base: main
pkg/decoders/html_entity.go
if matched {
	decodableChunk := &DecodableChunk{
		DecoderType: detectorspb.DecoderType_ESCAPED_UNICODE,
Should this be a new decoder type?
I think we've reached the point where we should consider adding a
One potential improvement might be to implement this as a handler and do identification of the whole file before decoding and chunking it out.
I do worry about the impact of having too many decoders. At a minimum, having something like ahocorasick might be more efficient than checking
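The pre-filtering concern above can be illustrated without any third-party dependency. The sketch below is a plain byte scan, not Aho-Corasick: it only answers whether a chunk could contain an HTML entity, so that the more expensive decode step can be skipped for most inputs. The name `mayContainEntity` and the `maxEntityLen` bound are assumptions made for this example and do not come from the PR.

```go
package main

import "bytes"

// maxEntityLen is an assumed upper bound on the number of bytes between
// '&' and ';' that this sketch is willing to scan.
const maxEntityLen = 32

// mayContainEntity (hypothetical helper) reports whether data contains
// something shaped like an HTML entity: '&', an optional '#', one or more
// ASCII letters or digits, then ';'. It does not validate entity names.
func mayContainEntity(data []byte) bool {
	for {
		i := bytes.IndexByte(data, '&')
		if i == -1 {
			return false
		}
		// Continue scanning just after the ampersand.
		data = data[i+1:]
		for j := 0; j < len(data) && j <= maxEntityLen; j++ {
			c := data[j]
			if c == ';' {
				if j > 0 {
					return true
				}
				break // "&;" is not an entity
			}
			isAlnum := c >= '0' && c <= '9' || c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z'
			if !isAlnum && !(c == '#' && j == 0) {
				break // not entity-shaped; look for the next '&'
			}
		}
	}
}
```

A decoder could call a check like this first and return early when it reports false, which keeps the per-chunk cost close to a single `bytes.IndexByte` call for content without ampersands.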
While I think identifying the mimetype of a file would be a great addition (and make way for other enhancements), I'm not sure how much it would help in this case. HTML, Markdown, and AsciiDoc files are obviously sources that would benefit, but HTML-encoded content can show up in weird places like config files,
This decoder was actually inspired by #1550; I found several live connection strings that were not detected by TruffleHog because they contained encoded
Description:
This creates a decoder to handle HTML entities. Tests pass, but the implementation may not be the most efficient.
This fixes #2231.
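For readers unfamiliar with the problem, here is a minimal sketch of what decoding HTML entities looks like in Go. It is not the implementation in this PR: it relies on the standard library's `html.UnescapeString`, and the helper name `decodeHTMLEntities` is hypothetical. It returns the decoded bytes together with a flag saying whether anything changed, so that a caller only produces a new chunk when decoding had an effect.

```go
package main

import (
	"bytes"
	"html"
)

// decodeHTMLEntities is a hypothetical helper, not the code from this PR.
// It unescapes named and numeric entities (for example &amp; or &#64;)
// and reports whether the output differs from the input.
func decodeHTMLEntities(data []byte) ([]byte, bool) {
	decoded := []byte(html.UnescapeString(string(data)))
	if bytes.Equal(decoded, data) {
		// Nothing was decoded; hand back the original slice unchanged.
		return data, false
	}
	return decoded, true
}
```

With this sketch, an input such as `postgres://user:p&#64;ss@host/db?a=1&amp;b=2` would come back with `&#64;` replaced by `@` and `&amp;` replaced by `&`, which is the form a detector's pattern would expect to see.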
Checklist:
Tests passing (make test-community)?
Lint passing (make lint, this requires golangci-lint)?